Sparse inverse kernel Gaussian Process regression
Authors
Abstract
Regression problems on massive data sets are ubiquitous in many application domains, including the Internet, earth and space sciences, and finance. Gaussian Process regression is a popular technique for modeling the input-output relations of a set of variables under the assumption that the weight vector has a Gaussian prior. However, it is challenging to apply Gaussian Process regression to large data sets, since prediction based on the learned model requires inverting an n × n kernel matrix, where n is the number of training points. Approximate solutions based on sparse Gaussian Processes have been proposed for this problem. However, in almost all cases, these solution techniques are agnostic to the input domain and do not preserve the similarity structure in the data. As a result, although these solutions sometimes provide excellent accuracy, the resulting models are not interpretable, even though interpretable sparsity patterns are very important for many applications. We propose a new technique for sparse Gaussian Process regression that allows us to compute a parsimonious model while preserving the interpretability of the sparsity structure in the data. We discuss how the inverse kernel matrix used in Gaussian Process prediction gives valuable domain information, and then adapt inverse covariance estimation from Gaussian graphical models to estimate the Gaussian kernel. We solve the optimization problem using the alternating direction method of multipliers, which is amenable to parallel computation. We compare the performance of this algorithm to different existing methods for sparse covariance regression in terms of both speed and accuracy. We demonstrate the performance of our method in terms of accuracy, scalability, and interpretability on two different satellite data sets from the climate domain.
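The abstract does not spell out the optimization problem, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It uses the standard textbook ADMM scheme for l1-penalized inverse covariance estimation (the graphical lasso objective) and then plugs the resulting sparse matrix into the usual Gaussian Process predictive mean. The function names, the RBF kernel, the toy data, and all parameter values (`lam`, `rho`, `noise_var`, `length`) are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(a, kappa):
    """Elementwise soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)

def sparse_inverse_admm(S, lam=0.1, rho=1.0, n_iter=500, tol=1e-8):
    """Approximately solve  min_X tr(S X) - log det X + lam * ||X||_1  by ADMM.

    S is a symmetric positive definite matrix; the returned Z is a sparse
    estimate of its inverse. Generic sketch, not the paper's exact method.
    """
    n = S.shape[0]
    Z = np.eye(n)         # sparse copy of the estimate
    U = np.zeros((n, n))  # scaled dual variable
    for _ in range(n_iter):
        # X-update: solve rho*X - X^{-1} = rho*(Z - U) - S via eigendecomposition.
        w, Q = np.linalg.eigh(rho * (Z - U) - S)
        x = (w + np.sqrt(w ** 2 + 4.0 * rho)) / (2.0 * rho)
        X = (Q * x) @ Q.T
        # Z-update: soft-thresholding sets small entries exactly to zero.
        Z_old = Z
        Z = soft_threshold(X + U, lam / rho)
        # Dual update.
        U = U + X - Z
        if np.linalg.norm(X - Z) < tol and np.linalg.norm(Z - Z_old) < tol:
            break
    return Z

def rbf(a, b, length=0.5):
    """Squared-exponential kernel on 1-D inputs (hypothetical choice)."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

# Toy data (assumed for illustration only).
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 5.0, 30)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(30)
x_test = np.array([1.25, 3.7])

# A large noise variance keeps the toy kernel matrix well conditioned.
noise_var = 1.0
K = rbf(x_train, x_train) + noise_var * np.eye(30)

# Sparse estimate of (K + noise_var * I)^{-1}, used in the GP predictive mean.
P = sparse_inverse_admm(K, lam=0.05)
mean = rbf(x_test, x_train) @ P @ y_train

print(np.count_nonzero(P), "nonzero entries out of", P.size)
print(mean)
```

In this sketch the nonzero pattern of `P` is what would be inspected for structure between training points. A larger `lam` gives a sparser but more biased estimate, and with `lam=0` the iteration converges to the ordinary matrix inverse.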
Similar resources
Sparse Inverse Gaussian Process Regression with Application to Climate Network Discovery
Regression problems on massive data sets are ubiquitous in many application domains, including the Internet, earth and space sciences, and finance. Gaussian Process regression is a popular technique for modeling the input-output relations of a set of variables under the assumption that the weight vector has a Gaussian prior. However, it is challenging to apply Gaussian Process regression to lar...
Multiple Gaussian Process Models
We consider a Gaussian process formulation of the multiple kernel learning problem. The goal is to select the convex combination of kernel matrices that best explains the data and by doing so improve the generalisation on unseen data. Sparsity in the kernel weights is obtained by adopting a hierarchical Bayesian approach: Gaussian process priors are imposed over the latent functions and general...
Variational Model Selection for Sparse Gaussian Process Regression
Sparse Gaussian process methods that use inducing variables require the selection of the inducing inputs and the kernel hyperparameters. We introduce a variational formulation for sparse approximations that jointly infers the inducing inputs and the kernel hyperparameters by maximizing a lower bound of the true log marginal likelihood. The key property of this formulation is that the inducing i...
Batch Continuous-Time Trajectory Estimation as Exactly Sparse Gaussian Process Regression
In this paper, we revisit batch state estimation through the lens of Gaussian process (GP) regression. We consider continuous-discrete estimation problems wherein a trajectory is viewed as a one-dimensional GP, with time as the independent variable. Our continuous-time prior can be defined by any linear, time-varying stochastic differential equation driven by white noise; this allows the possib...
A Gradient-based Forward Greedy Algorithm for Sparse Gaussian Process Regression
In this chapter, we present a gradient-based forward greedy method for sparse approximation of the Bayesian Gaussian Process Regression (GPR) model. Unlike previous work, which is mostly based on various basis vector selection strategies, we propose to construct, rather than select, a new basis vector at each iterative step. This idea was motivated by the well-known gradient boosting approach...
Journal:
- Statistical Analysis and Data Mining
Volume 6, Issue
Pages -
Publication date: 2013